JUFIT: A Configurable Rule Engine for Filtering and Generating New Multilingual UMLS Terms

Authors

  • Johannes Hellrich
  • Stefan Schulz
  • Sven Buechel
  • Udo Hahn
Abstract

We describe JuFiT, an easily adjustable rule engine that filters non-natural terms (i.e., ones that do not usually occur in running citation texts) from the UMLS Metathesaurus and also adds new terms to the UMLS by rewriting non-natural terms. Unlike previous attempts (with MetaMap or Casper), JuFiT serves multilingual purposes: it runs for English, Spanish, French, German and Dutch, the most prominent European languages in terms of UMLS coverage. We evaluated JuFiT under a variety of experimental conditions and found evidence that it increases annotation quality for English, and most likely also for German and Spanish.
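The filter-and-rewrite idea in the abstract can be illustrated with a minimal sketch. The two rules below (dropping terms with an "NOS" marker or a trailing parenthetical, and rewriting comma inversions such as "Cancer, Lung") are assumptions chosen for illustration; they are not taken from JuFiT, and the function names `is_non_natural`, `rewrite_syntactic_inversion` and `process_terms` are hypothetical.

```python
import re

# Hypothetical filter pattern (an assumption, not JuFiT's rule set):
# "NOS" markers, bracketed annotations, or a trailing parenthetical.
NON_NATURAL = re.compile(r"\bNOS\b|\[[^\]]*\]|\([^)]*\)$")


def is_non_natural(term: str) -> bool:
    """Return True if the term matches the illustrative filter pattern."""
    return bool(NON_NATURAL.search(term))


def rewrite_syntactic_inversion(term: str):
    """Illustrative rewrite rule: 'Cancer, Lung' -> 'Lung Cancer'.

    Returns None when the term is not a simple two-part inversion.
    """
    parts = term.split(", ")
    if len(parts) == 2:
        return f"{parts[1]} {parts[0]}"
    return None


def process_terms(terms):
    """Split terms into those kept as-is and those generated by rewriting."""
    kept, generated = [], []
    for term in terms:
        rewritten = rewrite_syntactic_inversion(term)
        if rewritten is not None:
            # The inverted form is replaced by its natural-order rewrite.
            generated.append(rewritten)
        elif not is_non_natural(term):
            kept.append(term)
    return kept, generated


kept, generated = process_terms(
    ["Lung cancer", "Cancer, Lung", "Fever NOS", "Asthma (disorder)"]
)
```

In this sketch each rule is a small independent function, so a language-specific configuration could enable or disable rules separately; whether JuFiT is organized this way is not stated in the abstract.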


Similar resources

Unsupervised Disambiguation for a Multilingual Medical Information System using UMLS

This paper describes techniques for unsupervised word sense disambiguation of English and German medical documents using the Unified Medical Language System (UMLS). We present both monolingual techniques which rely only on the structure of UMLS, and bilingual techniques which also rely on the availability of parallel corpora. The best results are obtained using relationships between terms given...


Quantifying the Impact and Extent of Undocumented Biomedical Synonymy Supporting Information

Consistent with previous observations [1, 2, 3], we noticed that many of the terms contained within the UMLS Metathesaurus were inappropriate for natural language-oriented analyses (ex: database-specific encodings, machine permutations, non-English language entries, etc.). Therefore, prior to generating the terminologies utilized in this study, we subjected the Metathesaurus to a thorough, rule...


Multilingual Ontology Enrichment for Semantic Annotation and Retrieval of Medical Information

Background: Knowledge management in the European project Noesis addresses concept-based annotation and multilingual Information Retrieval of documents. Objective: Multilingual enrichment of a concept-based terminology in the medical field. Experience and evaluation in the domain of cardiovascular diseases by enriching a subset of the MeSH thesaurus in six European languages. This terminology, r...


IndexFinder: A Method of Extracting Key Concepts from Clinical Texts for Indexing

Extracting key concepts from clinical texts for indexing is an important task in implementing a medical digital library. Several methods are proposed for mapping free text into standard terms defined by the Unified Medical Language System (UMLS). For example, natural language processing techniques are used to map identified noun phrases into concepts. They are, however, not appropriate for real...


Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

  One of the best-known recommendation methods is user-based Collaborative Filtering (CF). This approach compares the active user's item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...



Journal title:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium

Volume: 2015

Publication date: 2015